A Comment on the ROC Curve and the Area Under it as Performance Measures
Author
Abstract
The Receiver Operating Characteristic (ROC) curve is a two-dimensional measure of classification performance. The area under the ROC curve (AUC) is a scalar measure gauging one facet of performance. In this note, five idealized models are used to relate the shape of the ROC curve, and the area under it, to features of the underlying distribution of forecasts. This allows for an interpretation of the former in terms of the latter. The analysis is pedagogical in that many of the findings are already known in more general (and more realistic) settings; however, the simplicity of the models considered here allows for a clear exposition of the relation. For example, although in general there are many reasons for an asymmetric ROC curve, the models considered here clearly illustrate that for symmetric distributions, an asymmetry in the ROC curve can be attributed to unequal widths of the distributions. Also, for bounded forecasts, e.g., probabilistic forecasts, any asymmetry in the ROC curve can be explained in terms of a simple combination of the means and widths of the distributions. Furthermore, it is shown that AUC discriminates well between “good” and “bad” models, but not among “good” models.
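The two qualitative claims in the abstract can be illustrated with a minimal sketch. The abstract does not specify the five idealized models, so the sketch assumes one particular case: Gaussian forecast distributions for non-events, N(mu0, sigma0), and events, N(mu1, sigma1). The function names and the asymmetry measure are hypothetical choices made for illustration, not taken from the note.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal, provides cdf() and inv_cdf()


def hit_rate(far, mu0, sigma0, mu1, sigma1):
    """Hit rate at a given false-alarm rate (0 < far < 1).

    Assumes non-events ~ N(mu0, sigma0) and events ~ N(mu1, sigma1), with a
    forecast above the threshold classified as an event.
    """
    a = (mu1 - mu0) / sigma1  # separation of the means in units of sigma1
    b = sigma0 / sigma1       # ratio of the widths
    return _STD.cdf(a + b * _STD.inv_cdf(far))


def asymmetry(far, mu0, sigma0, mu1, sigma1):
    """Departure from symmetry about the anti-diagonal (illustrative measure).

    A symmetric ROC curve maps the point (far, hit) to (1 - hit, 1 - far),
    so this difference is zero everywhere on a symmetric curve.
    """
    h = hit_rate(far, mu0, sigma0, mu1, sigma1)
    return hit_rate(1.0 - h, mu0, sigma0, mu1, sigma1) - (1.0 - far)


def auc(mu0, sigma0, mu1, sigma1):
    """Area under the ROC curve for two Gaussians: P(event > non-event)."""
    return _STD.cdf((mu1 - mu0) / sqrt(sigma0**2 + sigma1**2))


if __name__ == "__main__":
    # Equal widths give a symmetric curve; unequal widths do not.
    print("asymmetry, equal widths:  ", asymmetry(0.2, 0.0, 1.0, 1.0, 1.0))
    print("asymmetry, unequal widths:", asymmetry(0.2, 0.0, 1.0, 1.0, 2.0))
    # AUC changes a lot from no skill to modest skill, little among good models.
    for separation in (0.0, 1.0, 3.0, 4.0):
        print("separation", separation, "AUC", auc(0.0, 1.0, separation, 1.0))
```

Under these assumptions, the asymmetry vanishes when the two widths are equal and is nonzero otherwise, and AUC rises steeply as the means first separate but flattens near 1 once the distributions are already well separated.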
Similar resources
Better Diagnosis of Drivers' Health Using Artificial Neural Networks
Introduction: Uncontrolled health problems in drivers can lead to the deaths of healthy individuals who are in the prime of their lives in terms of performance and wellness, and can also impose huge financial costs on a country. The purpose of this study was to design an intelligent system using multilayer perceptron (MLP) and radial basis function (RBF) neural networks in order to dia...
Comparison of Gestational Diabetes Prediction Between Logistic Regression, Discriminant Analysis, Decision Tree and Artificial Neural Network Models
Background and Objectives: Gestational Diabetes Mellitus (GDM) is the most common metabolic disorder in pregnancy. In case of early detection, some of its complications can be prevented. The aim of this study was to investigate early prediction of GDM by logistic regression (LR), discriminant analysis (DA), decision tree (DT) and perceptron artificial neural network (ANN) and to compare these m...
Predicting The Type of Malaria Using Classification and Regression Decision Trees
Maryam Ashoori (School of Technical and Engineering, Higher Educational Complex of Saravan, Saravan, Iran), Fatemeh Hamzavi (School of Agriculture, Higher Educational Complex of Saravan, Saravan, Iran). Abstract. Background: Malaria is an infectious disease infecting 200-300 million people annually. Environme...
Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis
Background: Unwanted pregnancy, that is, a pregnancy not intended by at least one of the parents, has undesirable consequences for the family and society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers attending health centers in Khorramabad, Iran, in 2012 were ...
Dust Source Areas, Boosted Regression Tree Model, Susceptibility Zoning, Eastern Iran
Due to drought and land use changes in recent years, dust storms in Iran are increasing as an environmental hazard. Dust influences climate change and human health, causing serious damage. The aim of this research is to identify dust source areas and map their sensitivity, for controlling dust and determining the role of each of the factors affecting its occurrence us...